A Novel Bilingual Word Embedding Method for Lexical Translation Using Bilingual Sense Clique

Authors

  • Rui Wang
  • Hai Zhao
  • Sabine Ploux
  • Bao-Liang Lu
  • Masao Utiyama
Abstract

Most existing methods for bilingual word embedding consider only shallow context or simple co-occurrence information. In this paper, we propose a latent bilingual sense unit (Bilingual Sense Clique, BSC), which is derived from a maximum complete subgraph of a pointwise mutual information (PMI) based graph over a bilingual corpus. In this way, we treat source and target words equally, and the separate bilingual projection step required by most existing work is no longer necessary. Several dimension reduction methods are evaluated to summarize the BSC-word relationship. The proposed method is evaluated on bilingual lexicon translation tasks, and empirical results show that bilingual sense embedding methods outperform existing bilingual word embedding methods.
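
To make the clique idea concrete, here is a minimal sketch, not the authors' implementation. It assumes a toy sentence-aligned corpus, merges each source/target sentence pair into one bag of words so both languages are treated alike, links two words when their PMI exceeds a threshold, and enumerates complete subgraphs with a plain Bron-Kerbosch search. The corpus, the `src:`/`tgt:` prefixes, the threshold of 0, and the function names are all illustrative assumptions.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_graph(pairs, threshold=0.0):
    """Link words whose sentence-level PMI exceeds `threshold`.

    Each item of `pairs` is (source_tokens, target_tokens). Both sides are
    merged into a single bag, so source and target words are handled alike.
    """
    word_count = Counter()
    pair_count = Counter()
    n = len(pairs)
    for src, tgt in pairs:
        # Prefixes keep identical spellings in the two languages apart.
        bag = sorted({f"src:{w}" for w in src} | {f"tgt:{w}" for w in tgt})
        word_count.update(bag)
        pair_count.update(combinations(bag, 2))
    graph = {w: set() for w in word_count}
    for (a, b), c in pair_count.items():
        pmi = math.log((c * n) / (word_count[a] * word_count[b]))
        if pmi > threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def maximal_cliques(graph):
    """Bron-Kerbosch without pivoting; yields each maximal clique once."""
    def expand(r, p, x):
        if not p and not x:
            yield frozenset(r)
            return
        for v in list(p):
            yield from expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}
    yield from expand(set(), set(graph), set())


# Hypothetical English-French toy corpus; "bank" is ambiguous on purpose.
pairs = [
    (["bank", "money"], ["banque", "argent"]),
    (["bank", "river"], ["rive", "fleuve"]),
    (["money"], ["argent"]),
    (["river"], ["fleuve"]),
]
graph = pmi_graph(pairs)
cliques = list(maximal_cliques(graph))
for clique in cliques:
    print(sorted(clique))
```

In this toy run, `src:bank` lands in two different cliques, one with `tgt:banque` and one with `tgt:rive`, which is the kind of sense separation the abstract describes. The dimension reduction step that turns the clique-word relationship into embeddings is not sketched here.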

Similar resources

Building a Bilingual Representation of the Roget Thesaurus for French to English Machine Translation

This paper describes a solution to lexical transfer as a trade-off between a dictionary and an ontology. It shows its association to a translation tool based on morpho-syntactical parsing of the source language. It is based on the English Roget Thesaurus and its equivalent, the French Larousse Thesaurus, in a computational framework. Both thesauri are transformed into vector spaces, and all mo...

Language Development and Lexical Awareness of Bilingual (Azeri-Persian) Hard of Hearing Impaired Children

The Relationship between Mean Length of Utterance (MLU), Lexical Richness, and Syntactic and Lexical Metalinguistic Awareness in Bilingual (Turkish-Persian) Normal and Hearing-Impaired Children. Objectives: Given the impact of hearing loss on language development and metalinguistic skill, and given that language development differs from metalinguistic skill in bilingual children, studying of...

Building Bilingual Lexicons using Lexical Translation Probabilities via Pivot Languages

This paper proposes a method of increasing the size of a bilingual lexicon obtained from two other bilingual lexicons via a pivot language. When we apply this approach, there are two main challenges, ambiguity and mismatch of terms; we target the latter problem by improving the utilization ratio of the bilingual lexicons. Given two bilingual lexicons between language pairs Lf –Lp and Lp–Le, we ...

LIHLA: A lexical aligner based on language-independent heuristics

Alignment of words and multiword units plays an important role in many natural language processing applications, such as example-based machine translation, transfer rule learning for machine translation, bilingual lexicography, word sense disambiguation, etc. In this paper we describe LIHLA, a lexical aligner which uses bilingual probabilistic lexicons generated by a freely available set of too...

Evaluating the LIHLA lexical aligner on Spanish, Brazilian Portuguese and Basque parallel texts

Alignment of words and multiword units plays an important role in many natural language processing applications, such as example-based machine translation, transfer rule learning for machine translation, bilingual lexicography, word sense disambiguation, etc. In this paper we describe LIHLA, a lexical aligner which uses bilingual probabilistic lexicons generated by a freely available set of too...

Journal:
  • CoRR

Volume abs/1607.08692

Publication date 2016